A Probabilistic NF2 Relational Algebra for Integrated Information Retrieval and Database Systems

Author

  • Norbert Fuhr
Abstract

The integration of information retrieval (IR) and database systems requires a data model which allows for modelling documents as entities, representing uncertainty and vagueness and performing uncertain inference. For this purpose, we present a probabilistic data model based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. Thus, the set of weighted index terms of a document is represented as a probabilistic subrelation. In a similar way, imprecise attribute values are modelled as a set-valued attribute. We redefine the relational operators for this type of relations such that the result of each operator is again a probabilistic NF2 relation, where the weight of a tuple gives the probability that this tuple belongs to the result. By ordering the tuples according to decreasing probabilities, the model yields a ranking of answers like in most IR models. This effect can also be used for typical database queries involving imprecise attribute values as well as for combinations of database and IR queries.
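The idea of weighted tuples and ranked results can be illustrated with a small sketch. It is not the paper's algebra: the abstract does not give the operator definitions, so the code below uses a flat (not nested) relation and assumes that tuples are independent events, which makes the weight of a merged tuple under projection `1 - prod(1 - p)`. The names `select`, `project`, `rank` and the sample `index` data are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a probabilistic relation is modelled as a mapping from a tuple
# to the probability that this tuple belongs to the relation.
ProbRelation = Dict[tuple, float]


def select(rel: ProbRelation, predicate: Callable[[tuple], bool]) -> ProbRelation:
    """Keep the tuples that satisfy a (crisp) predicate; weights are unchanged."""
    return {t: p for t, p in rel.items() if predicate(t)}


def project(rel: ProbRelation, positions: Sequence[int]) -> ProbRelation:
    """Project onto the given attribute positions.

    Tuples that collapse to the same projected tuple are merged. Under the
    independence assumption, the merged weight is 1 - prod(1 - p_i).
    """
    miss: Dict[tuple, float] = {}  # probability that no source tuple holds
    for t, p in rel.items():
        key = tuple(t[i] for i in positions)
        miss[key] = miss.get(key, 1.0) * (1.0 - p)
    return {key: 1.0 - m for key, m in miss.items()}


def rank(rel: ProbRelation) -> List[Tuple[tuple, float]]:
    """Order tuples by decreasing probability, as in an IR ranking."""
    return sorted(rel.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weighted index: (document, term) -> indexing weight.
index: ProbRelation = {
    ("d1", "ir"): 0.8,
    ("d1", "db"): 0.5,
    ("d2", "ir"): 0.4,
    ("d2", "nlp"): 0.6,
    ("d3", "db"): 0.7,
}

# Query: documents indexed with "ir" or "db", ranked by probability.
hits = select(index, lambda t: t[1] in {"ir", "db"})
ranking = rank(project(hits, [0]))
for doc, weight in ranking:
    print(doc[0], round(weight, 3))
```

With this sample data the ranking order is d1, d3, d2: d1 is supported by two matching terms, so its merged weight exceeds either single weight. A different dependence assumption between tuples would change the merge rule in `project` but not the overall select, project, rank pattern.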

Similar resources

A Probabilistic NF2 Relational Algebra for Imprecision in Databases

We present a probabilistic data model which is based on relations in non-first-normal-form (NF2). Here, tuples are assigned probabilistic weights giving the probability that a tuple belongs to a relation. This way, imprecise attribute values are modelled as a probabilistic subrelation. For information retrieval, the set of weighted index terms of a document can be represented in the same way, thu...

Data Structures for an Integrated Data Base Management and Information Retrieval System

New applications like office information systems need interfaces to data bases which integrate classical data manipulation with management and retrieval of textual (“unformatted”) data. The relational data model is widely accepted as a high level interface to classical (“formatted”) data management. It turns out, however, to be inconvenient for handling even simple data structures as commonly u...

Models for Integrated Information Retrieval and Database Systems

In this paper, we show that there is a mismatch between information retrieval (IR) and database (DB) concepts, and we devise solutions for this problem. DB oriented approaches have to distinguish between the logical and the content structure of objects, and should also consider the layout structure. Data independence—not regarded in IR before—can be achieved by using the notion of vague predica...

Logical and Conceptual Models for the Integration of Information Retrieval and Database Systems

We present two new approaches to the problem of integrating information retrieval (IR) and database (DB) systems. On the logical level, IR is based on uncertain inference, which is a generalization of the certain inference process employed in DB systems. As an implementation of this concept, we present a probabilistic relational algebra. On the conceptual level, we distinguish between the logic...

Bridging Information Retrieval and Databases

For bridging the gap between information retrieval (IR) and databases (DB), this article focuses on the logical view. We claim that IR should adopt three major concepts from DB, namely inference, vague predicates and expressive query languages. By regarding IR as uncertain inference, probabilistic versions of relational algebra and Datalog yield very powerful inference mechanisms for IR as well...

Publication date: 1996